Abstract. Software security represents a major concern as cyber attacks
continue to grow in number and sophistication. One security-weakening factor
is the standardized software ecosystem, which facilitates the spread of
malware among systems that share common vulnerabilities. In this overview
article, the main concepts associated with diversity and software redundancy
are described from the perspective of improving attack resistance. The
remarkable progress made in this area, where commercial implementations are
now emerging, is also highlighted.

1.  Introduction 

The security of information systems remains an extremely
critical issue despite the progress made over the
last 10 years in the fields of software quality and system
reliability. Defensive measures do not seem able to keep up with the
continuous growth of cyber threats, which are increasing not only
in number but also in sophistication and scale [1, 2].

It remains extremely difficult to produce fault-free software
despite the rigorous quality controls that are generally part of
the software development process. Residual faults constitute
dormant vulnerabilities that may eventually be discovered
by malicious attackers and exploited to carry out cyber
attacks. Moreover, in order to ease system management,
reduce configuration errors, and achieve portability, most
systems in use today run substantially similar software.
This is called information technology monoculture [3, 4]. As a
consequence, these systems share similar vulnerabilities, which
facilitates malware propagation and enables large-scale exploitation
of these common vulnerabilities.

The Canadian Forces, like most armed forces around the
world, are very much concerned by the “conjugation of these
risks factors”: the increased threat capabilities aimed at vulnerable
infrastructures combined with society’s dependency on information
sharing. With the recognition that cyber attacks are inevitable,
a shift from traditional defensive strategies
toward more proactive measures can now be observed. This
includes: (a) more rigorous monitoring for earlier attack detection;
(b) the capture of legal evidence (cyber forensics) to enable
post-event investigation;  (c) semi-automated responses to
the most likely attacks; and (d) pre-programmed recovery strategies
to minimize the impact of successful attacks.

Among the technologies that have the potential to mitigate
cyber attack risks, “software redundancy” that includes
“component diversity” appears to be one of the rare technologies
promising an order-of-magnitude increase in system security. The
basic idea is simply to have critical systems implemented in two
(or more) instances using sufficiently different sub-systems (e.g.,
Linux and BSD Unix) so that the same dormant vulnerability does
not exist in both redundant systems, making it impossible for
attackers to exploit the same vulnerability in both instances
simultaneously. Not only does such an architecture offer attack resistance,
it also greatly improves the monitoring of transactions and the
early detection of abnormal behavior through the comparison of both
executions. It also enables continuity of service, since the replica
can handle user requests while the first system is targeted,
investigated, or recovering from a recent attack.

In 2008, Defence R&D Canada initiated a study to evaluate
the state of the art in software redundancy implementing
technological diversity to mitigate the risk associated with the IT
monoculture. The amount of high-quality work under way
in the scientific community is impressive. This short article gives
an overview of the state of the art in system redundancy using
different types of diversity paradigms.

2.  Redundancy  and  Diversity  Combined 
in  a  Defense  Mechanism 

Redundancy is traditionally used to achieve fault tolerance
and higher system reliability. This has proven to be valid mainly
for hardware, where the failure independence assumption holds
because hardware failures are typically due to random faults.
The replication of components therefore provides added assurance.
When it comes to software, however, failures are due to
design and/or implementation faults. Such faults
are embedded within the software and their manifestation is
systematic: identical replicas fail in the same way on the same input.
Therefore, redundancy alone is not effective against
software faults.

Faults embedded in software represent potential vulnerabilities,
which can be exploited by external interactive malicious
faults (i.e., attacks) [5]. These attacks can ultimately enable the
violation of the system security property (i.e., security failure)
[5]. Therefore, the diversity principle can potentially be used for
security purposes. First, diversity can be used to reduce
common vulnerabilities. This is achieved by building a software
system out of a set of diverse but functionally equivalent
components. This in turn makes it very difficult for a malicious
opponent to break into every component with the very same
attack. Second, the ability to build a system out of redundant
and diverse components provides an opportunity to monitor
the system by comparing the dynamic behavior of the diverse
components when presented with the same input. This endows
the system with an efficient intrusion detection capability.

Diversity has therefore naturally caught the attention of the
software security research community. The seminal work presented
by Forrest et al. [6] promotes the general philosophy of
system security using diversity. The authors argue that uniformity
represents a potential weakness because any flaw or vulnerability
in an application is replicated on many machines. The security
and the robustness of a system can be enhanced through the
deliberate introduction of diversity. Deswarte et al. [7] review the
different levels of diversity of software and hardware systems
and distinguish different dimensions and different degrees of
diversity [8]. Bain et al. [9] presented a study to understand the
effects of diversity on the survivability of systems faced with a
set of widespread computer attacks including the Morris worm,
Melissa virus, and LoveLetter worm. Ammann et al. [10] report
on a discussion held by a panel of renowned researchers about
the use of diversity as a strategy for computer security and the
main open issues requiring further research. It emerges from
this discussion that there is a lack of quantitative information on
the cost associated with diversity-based solutions and a lack of
knowledge about the extent of protection provided by diversity.

Three main levels of security enhancement based on diversity
and redundancy can be distinguished: first, at the architecture
level, where replicas of critical sub-systems are introduced to
maintain service delivery even when one sub-system fails; second,
at the code level, where program transformations are applied
to diversify the replicas; and finally, a fully monitored combination of
diversified components assembled in a secure architecture.
These approaches are discussed further in the following sections.

3.  Redundancy  Obtained  by  Multiple  Instances 
Running  in  Parallel 

Two categories of software architectures implementing redundancy
can be distinguished. The first category uses a proxy to
coordinate multiple COTS applications, while the second one uses
middleware to achieve the same purpose. Notably, some commercial
products implementing such strategies can now be found
on the market, such as everRun for Windows by Marathon Technologies.

3.1  Multiple  COTS  Applications  Coordinated  by  a  Proxy 

The software architectures described in this section implement
the architectural pattern depicted in Figure 1. This approach
is well suited to the integration of COTS components
or of legacy and closed applications that deliver the services.
The servers are shielded from the user side through proxies.
Monitoring and voting mechanisms are used to check the health
of the system, validate the results, and detect abnormal behavior.
Examples of this approach include the Dependable Intrusion
Tolerance architecture [11, 12], the Scalable Intrusion Tolerant
Architecture [13], and Hierarchical Adaptive Control for QoS
Intrusion Tolerance (HACQIT) [14].
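
The proxy-and-voter pattern described above can be sketched in a few lines of Python. This is an illustration only, not the mechanism of any of the cited architectures: the names vote and proxy, the modelling of each replica as a plain function, and the strict-majority rule are all assumptions made for the example.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical model: a replica is any callable mapping a request to a response.
Replica = Callable[[str], str]


def vote(responses: Sequence[str]) -> Tuple[Optional[str], List[int]]:
    """Return the strict-majority response and the indices of dissenting replicas.

    Assumes at least one response. If no response has a strict majority,
    return (None, all indices), since no replica can be trusted over another.
    """
    winner, count = Counter(responses).most_common(1)[0]
    if 2 * count <= len(responses):
        return None, list(range(len(responses)))
    suspects = [i for i, r in enumerate(responses) if r != winner]
    return winner, suspects


def proxy(request: str, replicas: Sequence[Replica]) -> Tuple[Optional[str], List[int]]:
    """Forward one request to every replica, then vote on the answers."""
    return vote([replica(request) for replica in replicas])
```

With three replicas, a single deviating replica is outvoted and flagged as a suspect; with only two, a disagreement can be detected but not arbitrated, which is why the sketch returns no answer in that case.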

3.2  Multiple  Applications  Assembled  Through  Middleware 

Middleware-based approaches are much richer since they can
provide server coordination between multiple “diverse” applications
while hiding the sub-system differences [15]. Several intrusion
tolerant software architectures are part of this category.

The Intrusion Tolerance by Unpredictable Adaptation architecture
is a distributed object framework that integrates several
mechanisms to enable the defense of critical applications [16].
The objective of this architecture is to enable the tolerance of
sophisticated attacks aimed at corrupting a system.

Malicious and Accidental Fault Tolerance for Internet Applications
[17] is a European research project that set out to
investigate the tolerance paradigm systematically
in order to build large-scale dependable distributed applications.

Designing Protection and Adaptation Into a Survivability
Architecture [18, 19] is a survivability architecture providing a
diverse set of defense mechanisms. This architecture diversity
is used to achieve defense in depth and a multi-layer security
approach [19]. The architecture relies on a robust network
infrastructure that supports redundancy and provides security
services such as packet filtering, source authentication, link-level
encryption, and network anomaly sensors. The detection of
violations “triggers” defensive responses provided by middleware
components in the architecture.

Figure 1: General Architectural Pattern of Intrusion Tolerance

Fault/intrusiOn REmoVal through Evolution and Recovery
(FOREVER) [20] is a service used to enhance the resilience
of intrusion-tolerant replicated systems. FOREVER achieves this
goal through the combination of recovery and evolution. It
allows a system to recover from malicious attacks or faults using
time-triggered (periodic) or event-triggered recoveries.

4.  Diversity  Obtained  by  Program  Transformations 

Diversity can be introduced in the software ecosystem by
applying automatic program transformations that preserve the
functional behavior and the programming language semantics.
These transformations essentially randomize the code, the address
space layout, or both in order to provide a probabilistic
defense against unknown threats. Three main techniques can
be used to randomize software:

Instruction  Set  Randomization  (ISR)  [21, 22]  changes  the 
instruction  set  of  the  processor  so  that  unauthorized  code 
will  not  run  successfully.  The  main  idea  underlying  ISR  is  to 
decrease  the  attacker’s  knowledge  about  the  language  used  by 
the  runtime  environment  on  which  the  target  application  runs. 
ISR  techniques  aim  at  defending  against  code  injection  attacks, 
which  consist  of  introducing  executable  code  within  the  address 
space  of  a  target  process,  and  then  passing  the  control  to  the 
injected  code.  Code  injection  attacks  can  succeed  when  the 
injected  code  is  compatible  with  the  execution  environment. 
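
The principle behind ISR can be illustrated with a toy interpreter. The sketch below is an assumption-laden simplification, not the scheme of [21, 22]: the three-opcode stack machine, the single-byte XOR key, and the names randomize and run are all invented for the example. Legitimate code is encoded with the secret key when it is loaded; injected code, which the attacker could not encode, decodes to an illegal instruction.

```python
PUSH, ADD, HALT = 0x01, 0x02, 0x03  # toy instruction set (hypothetical)


class IllegalInstruction(Exception):
    """Raised when a decoded byte is not a valid opcode."""


def randomize(code: bytes, key: int) -> bytes:
    """Encode a program under a per-process key (applied once, at load time)."""
    return bytes(b ^ key for b in code)


def run(code: bytes, key: int) -> int:
    """Decode each byte with the key just before interpreting it."""
    stack = []
    pc = 0
    while pc < len(code):
        op = code[pc] ^ key
        if op == PUSH:
            stack.append(code[pc + 1] ^ key)  # operand is encoded as well
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == HALT:
            return stack.pop()
        else:
            raise IllegalInstruction(f"opcode {op:#04x} at offset {pc}")
    raise IllegalInstruction("program ended without HALT")
```

A program encoded with the right key runs normally, while the same bytes supplied unencoded (as injected code would be) fail on the first instruction. A one-byte key is far too small for real use; it only keeps the example readable.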

Address Space Randomization (ASR) [23] is used to increase
software resistance to memory corruption attacks. These attacks are
designed to exploit memory manipulation vulnerabilities such as
stack and heap overflows and underflows, format string
vulnerabilities, array index overflows, and uninitialized variables. ASR
consists basically of randomizing the location of the different regions of the
process address space, such as the stack and the heap. Notably,
ASR has been integrated into the default configuration of
the Windows Vista operating system [24].
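
The probabilistic nature of this defense can be made concrete with a small simulation. The sketch below is illustrative only: the page granularity, the function names, and the attacker model (a single hard-coded guess at the non-randomized location) are assumptions. With n bits of entropy in the base address, such a guess succeeds with probability 2^-n per attempt.

```python
import random

PAGE_SIZE = 0x1000  # assumed granularity of the randomization


def random_base(rng: random.Random, entropy_bits: int) -> int:
    """Pick a page-aligned base address with the given number of random bits."""
    return rng.randrange(2 ** entropy_bits) * PAGE_SIZE


def attack_success_rate(entropy_bits: int, trials: int, seed: int = 0) -> float:
    """Fraction of process launches in which a fixed address guess hits the target."""
    rng = random.Random(seed)
    guessed_base = 0  # the attacker assumes the non-randomized layout
    hits = sum(random_base(rng, entropy_bits) == guessed_base for _ in range(trials))
    return hits / trials
```

With zero bits of entropy the guess always succeeds; with 16 bits the expected success rate drops to about one in 65,536 attempts.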

Data Space Randomization (DSR) is a different randomization-based
approach that also aims at defending against
memory error exploits [25]. In particular, DSR randomizes the
representation of data objects. This is often implemented by
applying a modification to the data representation, such as
XORing each data object in memory
with a randomly chosen mask value. The data are unmasked
right before being used. This makes the results of using
corrupted data highly unpredictable. The DSR technique seems
to have advantages over ASR, as it provides a broader range of
randomization (on 32-bit architectures, integers and pointers are
randomized over a range of 2^32 values). In addition, DSR is able
to randomize the relative distance between two data objects,
addressing a weakness of the ASR technique.
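
The masking step can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of [25]: the class name MaskedWord, the 32-bit word size, and the optional mask parameter (present only to make the example reproducible) are invented for the example.

```python
import secrets
from typing import Optional

WORD_BITS = 32  # assumed word size


class MaskedWord:
    """A data object stored XORed with a per-object random mask."""

    def __init__(self, value: int, mask: Optional[int] = None) -> None:
        # A fresh random mask per object unless one is supplied for testing.
        self.mask = secrets.randbits(WORD_BITS) if mask is None else mask
        self.raw = value ^ self.mask  # what actually sits in "memory"

    def load(self) -> int:
        """Unmask right before use."""
        return self.raw ^ self.mask

    def store(self, value: int) -> None:
        """Mask on every legitimate write."""
        self.raw = value ^ self.mask
```

An attacker who overwrites raw directly, as a buffer overflow would, does not control the value the program sees: the program reads the attacker's bytes XORed with a mask the attacker does not know.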

5. Higher  Resistance  Obtained  by  Combining 
Redundancy  and  Diversity 

The ability to build a system combining redundant and diverse
components provides powerful new capabilities. One of them is
the advanced monitoring of the redundant system by comparing
the behavior of the diverse replicas. This endows the system
with efficient intrusion detection capabilities not achievable with
standard intrusion detection techniques based on signatures
or malware modeling. Moreover, with the introduction of some
assessment of the behavioral advantages of one implementation
over the others, a “meta-controller” can ultimately adapt the
system behavior or its structure over time. These futuristic
concepts are now being prototyped in several projects such as those briefly
described below.

5.1  Intrusion  Detection  using  Output  Voting 

Several experimental systems have used output voting to
detect some types of server compromise. For example,
the HACQIT system [11] uses the status codes of the server
replica responses. If the status codes differ, the system
detects a failure. Totel et al. [26] extend this work with a more
detailed comparison of the replica responses. They observed
that web server responses may be slightly different even when
there is no attack, and proposed a detection algorithm that detects
intrusions with higher accuracy (lower false alarm rate). These
research initiatives specifically target web servers and analyze
only server responses. Consequently, they cannot consistently
detect compromised replicas, since a compromise that leaves the
response unchanged goes unnoticed.
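
The difference between comparing status codes only and comparing whole responses while tolerating benign differences can be sketched as follows. This is not the algorithm of [26]: the response model, the function names, and the list of headers treated as benign are assumptions chosen for illustration.

```python
from typing import Dict, FrozenSet, Sequence, Tuple

# Hypothetical response model: (status code, headers, body).
Response = Tuple[int, Dict[str, str], bytes]

# Assumed examples of headers that legitimately differ between diverse servers.
BENIGN_HEADERS: FrozenSet[str] = frozenset({"server", "date"})


def status_codes_differ(responses: Sequence[Response]) -> bool:
    """Coarse check: flag only a mismatch in status codes."""
    return len({status for status, _, _ in responses}) > 1


def normalize(response: Response, ignored: FrozenSet[str] = BENIGN_HEADERS):
    """Drop headers known to differ and put the rest in a canonical order."""
    status, headers, body = response
    kept = tuple(sorted(
        (name.lower(), value)
        for name, value in headers.items()
        if name.lower() not in ignored
    ))
    return status, kept, body


def responses_differ(responses: Sequence[Response],
                     ignored: FrozenSet[str] = BENIGN_HEADERS) -> bool:
    """Finer check: compare status, remaining headers, and body."""
    return len({normalize(r, ignored) for r in responses}) > 1
```

A defaced page served with a normal status code passes the coarse check but fails the finer one, while two honest replicas that differ only in an ignored header raise no alarm.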

5.2  Behavior  Monitoring  in  N-Variant  Systems 

N-variant systems provide a framework that allows executing
a set of automatically diversified variants using the same
inputs [27]. The framework monitors the behavior of the variants
in order to detect divergences. The variants are built so that
an anticipated type of exploit can succeed on only one variant.
Therefore, such exploits become detectable. Building the variants
requires a special compiler or a binary rewriter. Moreover,
this framework detects only anticipated types of exploits, against
which the replicas are diversified.
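
One way to obtain variants with this property is to give each variant a disjoint range of valid addresses, so that an absolute address injected by an attacker is valid in at most one of them. The sketch below assumes this form of diversification purely for illustration; the class Variant, the use of the top address bit as the partition selector, and the string outcomes are invented for the example.

```python
from typing import List

PARTITION_BIT = 31  # assumed: the top bit of a 32-bit address selects the partition


class Variant:
    """A variant whose valid addresses all lie in one half of the address space."""

    def __init__(self, index: int) -> None:
        self.index = index  # 0 or 1

    def translate(self, offset: int) -> int:
        """Map a variant-independent offset into this variant's partition."""
        return (self.index << PARTITION_BIT) | offset

    def access(self, address: int) -> str:
        """Return "ok" if the address is in this variant's partition, else "fault"."""
        return "ok" if (address >> PARTITION_BIT) & 1 == self.index else "fault"


def diverged(outcomes: List[str]) -> bool:
    """The monitor raises an alarm when the variants do not behave identically."""
    return len(set(outcomes)) > 1
```

Legitimate accesses go through translate and succeed in both variants. An input that carries a raw absolute address is delivered unchanged to both, faults in one of them, and is therefore detected as a divergence.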


5.3  Multi-variant  Execution  Environment 

Multi-variant code execution is a runtime monitoring technique
that prevents malicious code execution [28]. This technique
uses diversity to protect against malicious code injection
attacks. This is achieved by running several slightly different
variants of the same program in lockstep. The behavior of the
variants is compared at synchronization points, which are in
general system calls. Any divergence in behavior is suggestive
of an anomaly and raises an alarm.
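
The comparison at synchronization points can be sketched as below. This is a simplified model, not the monitor of [28]: it compares recorded system call traces after the fact, whereas a real monitor would intercept each call and hold the variants until they agree. The Syscall type and the function name are assumptions.

```python
from itertools import zip_longest
from typing import Iterable, Optional, Tuple

# Simplified model of a synchronization point: (system call name, arguments).
Syscall = Tuple[str, tuple]


def first_divergence(*traces: Iterable[Syscall]) -> Optional[int]:
    """Return the index of the first synchronization point where variants disagree.

    Returns None when all variants issue identical call sequences. A variant
    that stops early is padded with None and so also counts as diverging.
    """
    for step, calls in enumerate(zip_longest(*traces)):
        if any(call != calls[0] for call in calls[1:]):
            return step
    return None
```

If injected code takes effect in only one variant, that variant issues a different call (for example, spawning a shell instead of reading a file) and the mismatch is reported at that step.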

5.4  Behavioral  Distance 

The behavioral distance approach aims at detecting sophisticated
attacks that manage to emulate the original system
behavior, including returning the correct service response (also
known as mimicry attacks). These attacks are thus able to
defeat traditional anomaly-based intrusion detection systems.
Behavioral distance achieves this defense by comparing
the behaviors of two diverse processes handling the
same input. It measures the extent to which the two processes
behave differently. Gao et al. proposed two approaches to compute
such measures [29, 30].
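
As a rough intuition for what such a measure looks like, the sketch below uses a plain edit distance between two system call sequences. This is a stand-in chosen for simplicity and is not one of the measures of [29, 30]; it also assumes that the calls of both processes have already been mapped to a common vocabulary, which is not the case for genuinely diverse platforms. The threshold is a hypothetical tuning parameter.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two call sequences (unit costs)."""
    previous = list(range(len(b) + 1))
    for i, call_a in enumerate(a, start=1):
        current = [i]
        for j, call_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if call_a == call_b else 1)
            current.append(min(previous[j] + 1,      # delete call_a
                               current[j - 1] + 1,   # insert call_b
                               substitution))
        previous = current
    return previous[-1]


def is_suspicious(a: Sequence[str], b: Sequence[str], threshold: int) -> bool:
    """Raise an alarm when the two processes behave too differently."""
    return edit_distance(a, b) > threshold
```

An attack that returns the correct response but makes extra calls in one process increases the distance between the two sequences even though the outputs match.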

6.  Concluding  Remarks 

A few modern operating systems integrate some level of
diversity to improve internal security, and a few COTS packages
are emerging that add redundancy extensions to traditional
architectures. System architects should now
consider redundancy and component diversity more systematically
for critical systems that are operated in hostile environments.

In  many  instances,  the  cost  of  security  failures  may  well  justify 
the  additional  complexity  and  the  associated  deployment  and 
operating  costs.  The  exploitation  of  both  features  simultaneously 
remains  mostly  experimental  at  this  time  but  the  very  strong 
promises  that  such  architectures  make  will  continue  to  justify 
research  and  development  in  this  field. 